Mixed Bayesian Networks with Auxiliary Variables for Automatic Speech Recognition

Authors

  • Todd A. Stephenson
  • Mathew Magimai-Doss
  • Hervé Bourlard
Abstract

In standard automatic speech recognition (ASR), hidden Markov models (HMMs) compute their emission probabilities with an artificial neural network (ANN) or a Gaussian distribution conditioned only on the hidden state variable. Recent work [12] showed the benefit of also conditioning the emission distributions on a discrete auxiliary variable, which is observed in training and hidden in recognition. Related work [3] has shown the utility of conditioning the emission distributions on a continuous auxiliary variable. We apply mixed Bayesian networks (BNs) to extend these works by introducing a continuous auxiliary variable that is observed in training but hidden in recognition. We find that an auxiliary pitch variable that is itself conditioned on the hidden state can degrade performance unless the auxiliary variable is also hidden. Performance can be improved further by making the auxiliary pitch variable independent of the hidden state.
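
When the auxiliary variable is hidden at recognition time, the emission likelihood can be read as a marginal over that variable. The following Python sketch illustrates this under an assumption the abstract does not state: the emission is taken to be linear-Gaussian in a scalar auxiliary value, x | q, a ~ N(mu + B*a, Sigma), and the auxiliary variable is taken to be Gaussian, a ~ N(m, s2). The names `mu`, `B`, `Sigma`, `m`, and `s2` are hypothetical and chosen only for illustration; this is not necessarily the parameterization used by the authors.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian N(mean, cov) at x."""
    x = np.atleast_1d(x).astype(float)
    mean = np.atleast_1d(mean).astype(float)
    cov = np.atleast_2d(cov).astype(float)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)

def emission_logpdf_observed(x, a, mu, B, Sigma):
    """log p(x | q, a): auxiliary value a is observed (assumed linear-Gaussian form)."""
    return gaussian_logpdf(x, mu + B * a, Sigma)

def emission_logpdf_hidden(x, mu, B, Sigma, m, s2):
    """log p(x | q): scalar auxiliary a ~ N(m, s2) integrated out analytically.

    Under the linear-Gaussian assumption the marginal is again Gaussian,
    with mean mu + B*m and covariance Sigma + s2 * B B^T.
    """
    return gaussian_logpdf(x, mu + B * m, Sigma + s2 * np.outer(B, B))
```

In this sketch, the two topologies contrasted in the abstract differ only in how `m` and `s2` are indexed: one pair per hidden state when the auxiliary variable is conditioned on the state, and a single shared pair when it is independent of the state.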

Similar resources

Modeling auxiliary information in Bayesian network based ASR

Automatic speech recognition bases its models on the acoustic features derived from the speech signal. Some have investigated replacing or supplementing these features with information that cannot be precisely measured automatically (articulator positions, pitch, gender, etc.). Consequently, automatic estimations of the desired information would be generated. This data can degrade performance ...

An Introduction to Bayesian Networks for Automatic Speech Recog

Bayesian Networks are a particular type of Graphical Models, providing a general and flexible framework to model, factor, and compute joint probability distributions among random variables in a compact and efficient way. For speech recognition, a BN permits each speech frame to be associated with an arbitrary set of random variables. They can be used to augment well-known statistical paradigms ...

Bayesian network structures and inference techniques for automatic speech recognition

This paper describes the theory and implementation of Bayesian networks in the context of automatic speech recognition. Bayesian networks provide a succinct and expressive graphical language for factoring joint probability distributions, and we begin by presenting the structures that are appropriate for doing speech recognition training and decoding. This approach is notable because it expresse...

Dynamic Bayesian Networks for Multi-Dialect Isolated Arabic Recognition

Hidden Markov Models (HMMs) are currently widely used in Automatic Speech Recognition (ASR) and are regarded as the most effective models. HMMs are, moreover, just a special case of a class of graphical models, dynamic Bayesian Networks (DBNs). DBNs are more sophisticated modeling tools because they allow several specific variables to be included in the problem of automatic speech recognition other than the...

Investigating Mixed Discrete/Continuous Dynamic Bayesian Networks with Application to Automatic Speech Recognition

Notation

  • $s_t$: the state of a discrete (switch) hidden variable at time $t$
  • $h_t$: the state of a continuous hidden variable at time $t$
  • $o_t$: a feature vector at time $t$
  • $v_t$: a sample of the speech signal at time $t$
  • $x_{1:T}$: shorthand for $x_1, x_2, \ldots, x_T$
  • $\phi$: a particular setting of the HMM parameters

Publication date: 2002